Learning to Compress Ergodic Sources

Authors

  • Jonathan Baxter
  • John Shawe-Taylor
Abstract

We present an adaptive coding technique which is shown to achieve the optimal coding in the limit as the size of the text grows, while the data structures associated with the code grow only linearly with the text. The approach relies on Huffman codes which are generated relative to the context in which a particular character occurs. The Huffman codes themselves are inferred from the data that has already been seen. A key part of the paper involves showing that the loss per character incurred by the learning process tends to zero as the size of the text tends to infinity. This involves an analysis in an on-line learning framework bounding the cumulative loss, where loss is defined to be the excess code length. By using the Bayes prediction distribution and code, the expected loss per character converges to zero at the best possible rate of O(log n/n). By allowing the length of contexts to grow in response to commonly occurring subsequences, the coding is efficient precisely where it needs to be, hence achieving a high compression rate at a relatively low overhead in terms of data structure storage.

*Joint affiliation with Department of Mathematics, London School of Economics and Department of Computer Science, Royal Holloway, University of London. Proceedings of the 1996 Data Compression Conference (DCC), 1068-0314/96 $10.00 © 1996 IEEE
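The idea of per-context codes learned on-line can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it assumes a fixed context length (`order`) rather than contexts that grow, a known alphabet, and add-one (Laplace) counts as one simple Bayes predictive distribution. The function names `huffman_code_lengths` and `adaptive_code_lengths` are hypothetical. For each character it charges the length of the Huffman codeword built from the counts seen so far in that context, and also the ideal length -log2 p under the same counts, so the two totals can be compared.

```python
import heapq
import math
from collections import defaultdict
from itertools import count


def huffman_code_lengths(weights):
    """Return {symbol: codeword length in bits} for a Huffman code on positive weights."""
    if len(weights) == 1:
        # A single symbol still needs one bit to be written out.
        return {sym: 1 for sym in weights}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {sym: 0}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, depths1 = heapq.heappop(heap)
        w2, _, depths2 = heapq.heappop(heap)
        # Merging two subtrees pushes every leaf in them one level deeper.
        merged = {s: d + 1 for s, d in depths1.items()}
        merged.update({s: d + 1 for s, d in depths2.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]


def adaptive_code_lengths(text, alphabet, order=1):
    """Return (huffman_bits, ideal_bits) for coding `text` on-line.

    Assumption of this sketch: contexts are the previous `order` characters,
    and every context starts with a count of 1 per symbol (add-one smoothing).
    """
    counts = defaultdict(lambda: {s: 1 for s in alphabet})
    huffman_bits = 0
    ideal_bits = 0.0
    for i, sym in enumerate(text):
        context = text[max(0, i - order):i]
        table = counts[context]
        # Code the symbol using only what has been seen before position i.
        huffman_bits += huffman_code_lengths(table)[sym]
        ideal_bits += -math.log2(table[sym] / sum(table.values()))
        # Then learn from it, exactly as a decoder could.
        table[sym] += 1
    return huffman_bits, ideal_bits


if __name__ == "__main__":
    sample = "abcd" * 100
    huff, ideal = adaptive_code_lengths(sample, "abcd", order=1)
    print(f"Huffman bits per character: {huff / len(sample):.3f}")
    print(f"Ideal bits per character:   {ideal / len(sample):.3f}")
```

On a perfectly predictable text such as the sample above, the Huffman total cannot drop below one bit per character, because every codeword has at least one bit, while the ideal code length per character keeps shrinking as the counts accumulate. Rebuilding the Huffman code at every step is done here only for clarity; a practical coder would update the code incrementally.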

Similar resources

Nonparametric Estimation and On-Line Prediction for General Stationary Ergodic Sources

We propose a learning algorithm for nonparametric estimation and on-line prediction for general stationary ergodic sources. The idea is to prepare many histograms and estimate the probability distribution of the bins in each histogram. We do not know a priori which histogram expresses the true distribution: if the histogram is too sharp, the estimation captures the noise too much (overestimatio...

The ergodic decomposition of asymptotically mean stationary random sources

It is demonstrated how to represent asymptotically mean stationary (AMS) random sources with values in standard spaces as mixtures of ergodic AMS sources. This is an extension of the well-known decomposition of stationary sources which has facilitated the generalization of prominent source coding theorems to arbitrary, not necessarily ergodic, stationary sources. Asymptotic mean stationarity gener...

Individual ergodic theorem for intuitionistic fuzzy observables using intuitionistic fuzzy state

The classical ergodic theory has been built on σ-algebras. Later the Individual ergodic theorem was studied on more general structures like MV-algebras and quantum structures. The aim of this paper is to formulate the Individual ergodic theorem for intuitionistic fuzzy observables using m-almost everywhere convergence, where m...

Efficient Lossless Compression of Trees and Graphs

In this paper, we study the problem of compressing a data structure (e.g. tree, undirected and directed graphs) in an efficient way while keeping a similar structure in the compressed form. To date, there has been no proven optimal algorithm for this problem. We use the idea of building an LZW tree in LZW compression to compress a binary tree generated by a stationary ergodic source in an optimal ma...

Comparing the effect of warm moist compress and Calendula ointment on the severity of phlebitis caused by 50% dextrose infusion: A clinical trial

Background: One of the important hypertonic solutions is 50% dextrose. Phlebitis is the most common complication of this solution, and its management is necessary. Accordingly, the present study aimed to compare the effect of warm moist compress and Calendula ointment on the severity of phlebitis caused by 50% dextrose infusion. Methods: This clinical trial was conducted on...

Publication date: 1996